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§1  ACCOMPLISHMENTS 

Feature  sizes  of  semiconductor  devices  are  expected  to  shrink  to  70  nm  in  the  next  10  years  in 
order  to  achieve  higher  processing  speed  and  memory  density.  In  the  sub-0. 1 -pm  regime,  a  number 
of  critical  issues  require  atomistic  modeling.  First,  macroscopic  material  properties  often  do  not 
hold  at  the  nanometer  scale.  For  example,  a  Si02  film  no  longer  acts  as  an  insulator  with  thickness 
less  than  0.7  nm.  In  order  to  reduce  the  leakage  currents  in  gate  dielectric  films,  higher-permittivity 
materials  are  being  sought  to  replace  Si02.  These  so  called  high-A  materials  include  silicon 
oxynitrides  and  Ti02  among  others.  Another  critical  issue  is  inhomogeneous  nature  of  structural, 
mechanical,  and  vibrational  properties  at  atomic  scales  in  lattice-mismatched  alloys  such  as 
GaAs/InAs,  which  must  be  taken  into  consideration  for  mechanical  and  thermal  stability  of 
nanoelectronic  structures. 

Stress  also  plays  a  critical  role  in  the  fabrication  of  nanoscale  devices  by  interfacet  adatom 
migration  on  patterned  surfaces  and  lattice-mismatch-stress  induced  growth  of  coherent  3D  islands. 
Atomistic  simulations  are  expected  to  guide  the  design  of  optimal  surface  patterns  for  nanodevice 
fabrication. 

Physical  properties  of  nanometer-size  structures  often  exhibit  significant  size  dependence. 
Examples  include  phase  stability  and  oxidation  reaction.  Variation  of  physical  properties  with  size 
could  provide  methods  for  designing  devices  with  controlled  properties. 

During  the  three  and  a  half  years  of  this  project,  large-scale  (10-100  million  atoms)  parallel 
molecular  dynamics  (MD)  simulations  have  been  performed  to  study  a  number  of  physical 
properties  and  processes  in  nanometer-size  structures: 

1.  Fracture  in  GaAs  thin  films  (§1.1); 

2.  Structural  transformation  in  GaAs  nanocrystals  (§1.2); 

3.  Structural,  mechanical,  and  vibrational  properties  of  GaAs/InAs  alloys  (§  1 .3); 

4.  Inhomogeneous  stress  distribution  in  Si/Si3N4  nanopixels  (§1.4); 

5.  Oxidation  dynamics  of  aluminum  nanoclusters  (§  1 .5); 

6.  Dielectric  properties  of  high-A  Ti02  (§  1 .6). 

These  large-scale  atomistic  simulations  have  been  enabled  by  combining  novel  parallel 
multilevel  algorithms  and  new  simulation  methodologies  including  a  variable-charge  MD  approach, 
see  §1.7. 

These  research  activities  resulted  in  46  refereed  publications  (see  §6). 

§1.1  Fracture  in  GaAs 

Energetics  of  surface  microstructures  (steps,  kinks,  and  mesas)  is  of  essential  importance  for  the 
fabrication  of  semiconductor  devices.  Surface  energies  of  GaAs  and  other  semiconductors  have 
been  derived  from  fracture  experiments.  Fracture  energy  measured  in  these  experiments,  however, 
involves  various  dissipation  terms  (such  as  dislocation  emission  and  void  nucleation)  in  addition  to 
the  energy  for  cleavage  debonding. 

A  fundamental  quantity  involved  in  fracture  is  the  energy  release  rate,  G — the  rate  at  which 
the  mechanical  energy  stored  in  the  elastic  body  flows  into  the  crack  tip  per  unit  crack  advance.  It  is 
the  energy  to  create  a  new  surface  of  unit  area  during  fracture  process,  and  consists  of  the  energy 
for  cleavage  debonding  and  any  associated  energy  dissipations.  Gc  is  the  minimum  energy  at  which 
crack  propagation  is  possible,  and  it  is  an  intrinsic  material  property  that  characterizes  toughness. 

Molecular-dynamics  simulations  involving  up  to  100  million  atoms  have  been  performed  to 
study  fracture  in  GaAs  thin-film  strips,  see  Fig.  1.  Significant  orientation  dependence  of  fracture  is 
observed  (Fig.  2).  (1 10)  fracture  is  cleavage-like  and  is  characterized  by  Gc=  1.4  ±  0.1  J/m2.  The 
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calculatedcriti  cal  energy  release  rate  agrees  well  with  availableexperimentaldata,  1.5  -  1.7  J/m2.  In 
contrast  to  the  cleavage-like(l  10)  fracture, (1 1 1)  fracture  involvesslip  in  the  (1 1 1)  plane,  and  crack 
branching  is  observed  in  (001)  fracture.  Accordingly,  these  surfaces  are  characterized  by  larger 
toughness  values: Gc  =  1.7  J/m2  for  (111)  and  Gc  =  2.0  J/m2  for  (001). 


°zx(GPa) 

Fig.  1:  Shear  stress  distribution  near  a  crack  tip  in  a  GaAs  thin-film  strip  consisting  of  100-million  atoms. 


Fig.  2  :  Atomic  configurations  near  crack  tips  in  GaAs  for  (110),  (111),  and  (001)  fracture  surfaces. 

§1.2  Structural  Transformation  in  Gallium  Arsenide  Nanocrystals 

Despite  numerous  experimentaland  theoreticalstudies,  structural  transformations  in  GaAs  and  SiC 
at  high  pressures  are  not  well  understood  at  the  atomistic  level.  We  have  investigated  the 
mechanisms  of  these  transformations  using  an  isothermal-isobaric  MD  approach  and  new 
interatomic  potential  schemes.  In  SiC,  a  reversible  transformation  between  the  four-fold 
coordinated  zinc-blende  structure  and  the  six-fold  coordinated  rocksalt  structure  is  found  at  a 
pressure  of  100  GPa.  The  calculatedvolume  change  at  the  transition  and  the  hysteresis  are  in  good 
agreement  with  experimental  data.  The  atomistic  mechanism  for  the  structural  transformation  is  a 
cubic-to-monoclinicunit-celltransformationandarelativeshiftof  Si  and  C  sublattices  in  the  [100] 
direction.  The  new  transition  path  does  not  involveany  bond  breaking  and  it  has  a  significantly 
lower  activationenergy  compared  with  a  previously  proposed  transformationmechanism(Fig.  3). 

In  crystalline  GaAs,  the  pair-distribution  functions  and  bond-angle  distributions  indicate 
that  the  four-fold  coordinated  zinc-blende  structure  transforms  into  a  six-fold  coordinated 
orthorhombic  structure  at  a  pressure  of  22  GPa.  The  reverse  transformationfrom  the  orthorhombic 
to  zinc-blende  structure  shows  hysteresis  and  is  observed  at  a  pressure  of  ~  10  GPa.  These  MD 
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results  are  in  good  agreementwith  experiments  (Fig.  4).  The  calculatedvolumechange  of  ~  16%  is 
also  in  agreementwith  experiments. 


Fig.  3  :  The  newly  proposed  transition 
path  for  the  pressure-inducedzinc-blende-to- 
rocksalt  transformation  in  SiC  involves  a 
relative  shift  of  Si  and  C  sublattices  in  the 
[100]  direction.  It  has  a  small  activation 
enthalpy  (0.03  eV/atom). 


Fig.  4:  Comparison  between  (a)  MD  and  (b)  EXAFS  experimenta 
results  [J.  M.  Besson,  J.  P.  Itie,  A.  Polian,  et  al Phys.  Rev.  B  44 . 
4214  (1991)]  for  the  Ga-As  nearest-neighbor  distance  during  forwan 
(squares) and  reverse  (circles)  structural  transformations. 
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Aggregates  of  nanometer-size 
semiconductor  crystals  have 
promising  applications  as 
photovoltaics,  light-emitting  diodes, 
and  single-nanocrystal,  single¬ 
electron  transistors.*  We  have 
investigated  pressure-induced 

structural  transformations  in  GaAs 
nanocrystals  of  different  sizes  using 
parallel  MD  simulations.  It  is  found 
that  the  transformationfrom  four-fold 
(zinc  blende)  to  six-fold  coordination 
starts  at  the  surfaces  of  nanocrystals 
and  proceeds  inwards  with  increasing 
pressure.  Inequivalent  nucleation  of 
the  high-pressure  phase  at  different 
sites  leads  to  an  inhomogeneous 
deformation  of  the  nanocrystal.  For 
sufficiently  large  spherical 
nanocrystals,  this  gives  rise  to 
rocksalt  structures  of  different 
orientations  separated  by  grain 

boundaries,  see  Fig.  5.  The  absence  of  such  grain  boundaries  in  a  faceted  nanocrystal  of  moderate 
size  indicates  sensitivity  of  the  transformation  to  the  initial  nanocrystal  shape.  The  pressure 
corresponding  to  the  complete  transformation  increases  with  the  nanocrystal  radius  and  it 
approaches  the  bulk  value  for  a  spherical  nanocrystal  of  ~  5,000  atoms. 


V  V  i  &■  t  t  : 


Fig.  5  :  Bottom  left  panel  shows  color-coded  grain  structures  in  a 
GaAs  nanocrystal  at  a  pressure  of  22.5  GPa.  Bottom  right  panel  and 
insets  show  differentgrain  boundary  types. 


*J.  N.  Wickham,  A.  B.  Herhold,andA.  P.  Alivisatos,  Phys.  Rev.  Lett.  84, 923  (2000). 
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§1.3  Structural,  Mechanical,  and  Vibrational  Properties  of  GaAs/lnAs 
Alloys 

Reliable  interatomic  potential  is  an  essential  ingredient  of  MD  simulations  to  investigate  the  atomic- 
level  surface  stresses,  surface  stoichometry  and  migration  processes,  and  the  mechanical  stresses  in 
lattice-mismatched  InAs/GaAs  systems.  A  significant  step  forward  was  taken  in  developing  an 
interatomic  potential  for  GaAs  and  InAs.  The  potential  was  validated  against  experimental  data  and 
first-principles  bulk  electronic  structure  calculations  on  pressure-induced  structural  transformations, 
melting/recrystallization,  and  the  cleavage  energy.  Phonon  dispersion  of  GaAs  also  agrees  well 
with  experimental  data  (Fig.  6).  The  MD  results  for  the  LO  (longitudinal  optical)-TO  (transverse 
optical)  phonon  energy  splitting  at  the  T  point,  2.85  meV  for  GaAs  and  2.46  meV  for  InAs,  are  also 
in  reasonable  agreement  with  experimental  data  (2.70  meV  for  GaAs  and  2.46  meV  for  InAs, 
respectively). 

In  addition  to  describing  bulk  crystal  properties,  the  interatomic  potentials  provide  excellent 
representation  of  amorphous  systems.  For  example,  static  structure  factor  of  amorphous  GaAs  is 
in  excellent  agreement  with  neutron  scattering  data,  see  Fig.  7. 


Fig.  6  :  Theoretical  and  experimental  [D.  Strauch  and  B.  Dorner,  J.  Phys.:  Condens.  Matter  2,  1457  (1990)] 
phonon  dispersion  of  zinc-blende  GaAs. 


Fig.  7:  X-ray  static  structure  factor  of  amorphous  GaAs  at  density  5.11  g/cm3.  The  solid  curve  and  circles 
represent  MD  and  experimental  [D.  Udron  etal.,  J.  Non-Cryst.  Solids  137&138,  131  (1991)]  data,  respectively. 

In  order  to  study  InAs/GaAs  mesas,  we  need  to  develop  an  interatomic-potential  scheme  for 
mixed  InAs/GaAs  systems.  Recently,  we  have  developed  a  scheme  to  combine  interatomic 
potentials  of  binary  materials  in  such  a  way  that  the  resulting  potential  depends  on  the  local 
chemical  environment.  For  systems  involving  Ga,  In,  and  As,  we  use  a  linear  mixing  scheme  to 
combine  the  interatomic  potentials  for  GaAs  and  InAs.  This  scheme  is  adaptive  in  which  As  atoms 
are  classified  into  different  types  according  to  the  number  of  Ga  and  In  neighbor  atoms. 
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To  validate  this  scheme,  we  have  performed  MD  simulations  to  calculate  the  structural 
correlation  in  Gal  tInAAs  alloy.  Both  GaAs  and  InAs  form  the  zinc-blende  structure  at  low 
pressures.  Due  to  the  lattice  mismatch,  the  volume  of  Ga,^  hy\s  alloy  is  an  increasing  function  of 
x  However,  both  Ga-As  and  In- As  bond  lengths  change  very  little  as  a  function  of  x,  as  observed  in 
the  EXAFS  (extended  x-ray  absorption  fine  structure)  experiments  in  Fig.  8.  These  behaviors  of 
the  bond  lengths  are  well  reproduced  in  our  MD  simulations,  see  Fig.  8. 

In  pair  distribution  functions  for  cation-cation  and  As-As  pairs  obtained  from  MD 
simulations,  the  nearest  cation-cation  distance  has  a  broad  distribution,  whereas  the  nearest  As-As 
peak  splits  to  into  two  peaks  suggesting  the  existence  of  two  As-As  distances.  This  is  in  good 
agreement  with  experiments,  and  demonstrates  that  the  second-neighbor  correlation  in  the  alloy  is 
accurately  described  by  MD  simulations. 


EXAFS  Experiment  Theory 


Fig.  8:  (Left)  EXAFS  experimental  data  on  Ga-As  and  In- As  nearest  neighbor  distance  as  a  function  of  alloy 
composition  [J.  C.  Mikkelsen  and  J.  B.  Boyce,  Phys.  Rev.  Lett.  49,  1412  (1982)].  Middle  curve  is  the  virtual- 
crystal-approximation  bond  length  calculated  from  the  measured  X-ray  lattice  constant.  (Right)  The  MD  simulation 
results  for  the  same  quantities. 

We  have  used  the  new  interatomic  potential  scheme  to  calculate  the  elastic  constants  of 
Ga^In^As  alloy.  The  elastic  constants  have  been  found  to  exhibit  a  significant  nonlinear 
dependence  on  the  composition,  x,  see  Fig.  9.  We  have  also  studied  phonon  density-of-states  of 
Ga^In^As  (Fig.  9).  For  intermediate  x,  phonon  density-of-states  exhibits  a  two-mode  behavior 
associated  with  optical  modes  of  GaAs  and  InAs,  in  agreement  with  recent  experimental  results  [J. 
Groenen  et  al,  Phys.  Rev.  B  58, 10452  (1998)]. 


0  10  20  30  40  50 

Energy  (meV) 


Fig.  9:  (Left)  Elastic  constants  (C,„  Cn,  and  C14)  and  bulk  modulus  (B)  of  Ga,.Jln^As  alloy.  (Right)  Phonon 
density-of-states  of  the  same  system. 
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§1.4  Stress  in  Si/S i3N4  Nanopixels 

We  haveinvestigatedthe  effects  of  surfaces,  edges,  and  latticemismatch  at  the  Si(  1 1  l)/Si3N4(0001) 
interfaceon  the  atomic-levelstress  distribution  in  a  Si/Si3N4  nanopixel. 

A  10-million-atomMD  has  been  performed  on  128  CPUs  of  the  HP  Examplarat  Caltech. 
The  system  consists  of  a539Ax32AxlOA  Si  mesa  which  is  placed  on  top  of  a  1077Ax653Ax300A 
Si(lll)  substrate.  The  top  surface  of  the  mesa  is  covered  with  an  83A-thick  crystalline 
Si3N4(0001)  film  (Fig.  10).  The  Si(lll)/Si3N4(0001)  interface  involves  a  1.1%  latticemismatch 
inducing  a  stress  in  the  system.  The  latticemismatch  causes  compressive  stresses  in  Si3N4,  while  a 
tensile  stress  is  observed  in  Si.  Note  that  edges  and  comers  act  as  stress  concentrators. 

We  have  also  simulated  a  nanopixel  covered  with  an  amorphous  Si3N4  film.  We  find  the 
stress  to  be  highly  non-uniform  laterally,  quite  unlike  those  for  crystalline  films.  Such  lateral 
inhomogeneity  in  stress  is  expected  to  affect  the  processing  of  nanoscale  pixels. 


Fig.  10:  Stress  distribution  in  a  Si/Si  3N4  nanopixel  consisting  of  10-million  atoms  in  the  presence  of  the  1.1% 
lattice  mismatch.  To  show  the  stresses  inside  the  nanopixel  one  quarter  of  the  system  is  removed  and  the  value  of 
the  hydrostatic  stress  is  color-coded. 

§1.5  Oxidation  of  Aluminum  Nanoclusters 

Oxidation  is  important  as  a  dielectric-layerforming  process  in  microelectronics.  Oxidation  of 
nanometer-size  clusters,  however,  is  very  different  from  that  of  large,  flat  surfaces.  Aluminum 
nanoclusters  of  diameters  100-700  A  are  known  to  form  an  oxide  scale  with  thickness  between  20 
and  50  A  in  low-density  oxygen  gases  at  room  temperature.  Athigh  oxygen  densities,  however,  the 
reactionis  explosive, making  A1  nanoclusters  an  efficientrocketfuel. 

We  have  performed  the  first  MD  simulation  to  study  the  oxidation  dynamics  of  A1 
nanoclusters,  see  Fig.  11.  An  A1  nanocluster  of  radius  100A  was  placed  in  gaseous  oxygen.  The 
simulation  takes  into  account  the  effect  of  dynamic  charge  transfer  between  O  and  A1  on  the  basis 
of  the  electronegativitjequalizationprinciple.  Oxidation  starts  at  the  surface  of  the  cluster  and  the 
oxide  layer  grows  to  a  thickness  of  ~35  A.  The  surface  oxide  melts  because  of  the  high 
temperaturedue  to  the  release  of  the  Al-0  bonding  energy. 

We  have  also  performed  a  simulation  in  the  canonical  ensemble.  An  oxide  scale  of  40  A 
thickness  is  subsequently  formed,  as  seen  in  Fig.  12.  The  MD  simulations  provide  detailed  picture 
of  the  rapid  evolution  and  culmination  of  the  surface  oxide  thickness,  local  stresses,  and  atomic 
diffusivities.  In  the  first  5  ps,  oxygen  molecules  dissociate  and  the  oxygen  atoms  first  diffuse  into 
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octahedral  and  subsequently  into  tetrahedral  sites  in  the  A1  nanoparticle.  In  the  next  20  ps,  as  the 
oxygen  atoms  diffuse  radially  into  and  the  A1  atoms  diffuse  radially  out  of  the  nanoparticle,  the 
fraction  of  six-fold  coordinated  oxygen  atoms  drops  dramatically.  Concurrently,  there  is  a 
significantincrease  in  the  number  of  O  atoms,  forming  clusters  of  comer-sharing  and  edge-sharing 
OAl4  tetrahedra.  Between 30  and  35  ps,  clusters  of  OAl4  coalesce  to  form  a  neutral,  percolating 
tetrahedralnetwork  that  impedes  further  intrusion  of  oxygen  atoms  into  and  of  A1  atoms  out  of  the 
nanoparticle.  As  a  result,  a  stable  oxide  scale  is  formed.  Structural  analysis  reveals  a  40A  thick 
amorphous  oxide  scale  on  the  A1  nanoparticle.  The  thickness  and  structure  of  the  oxide  scale  are  in 
accordancewithexperimentalresults. 


Fig.  11:  Atomic  charges  during  the  oxidation  Fig.  12:  Atomic  charges  showing  an  A1  cluster  coveredwith  £ 
of  an  A1  nanocluster.  40  A-thick  oxide  scale. 


§1.6  Dielectric  Properties  of  High  -k Materials 

A  variable-charge MD  scheme  has  been  used  to  describe  dielectric  properties  of  high-&  Ti02. 
Titanium  dioxide  is  an  important  ceramicfor  a  variety  of  industrial  applications  that  stem  from  high 
polarizabilityand  large  static  dielectricconstant(~l  14  for  powdered  rutile  structure).  In  the  vicinitj 
of  interfaces,  Ti02  exhibits  significant  charge  fluctuations  which  is  evident  from  the  non- 
stoichiometricTi^O^  ,  (withx  ranging  from  4  to  10)  in  interfacial  regions.  Charge  transfer  among 
Ti  and  O  atoms  is  found  to  be  highly  sensitive  to  the  local  environment,  and  hence  the  variable 
charge  MD  is  essential  to  describe  its  dielectricproperties  correctly. 

Latticeconstant,  cohesive  energy,  and  elastic  moduli  calculatedwith  our  new  variable-charg< 
interatomicpotentialarein  good  agreementwithexperimentalresults.  The  interaction  potential  also 
describes  the  dielectric  properties  of  the  rutile  Ti02  correctly.  Static  dielectric  constants  are 
calculated  through  the  fluctuation-dissipation  theorem.  We  consider  a  periodically-repeatedunit 
cell,  obtain  its  independent  oscillatory  modes  by  diagonalizing  the  dynamical  matrix,  and  then 
calculate  the  electric  dipole  moment  using  the  law  of  equipartitionfor  thermal  equilibrium  among 
those  modes.  The  dielectricconstants  thus  calculatedincludethe  effects  of  ionic  motion  and  charge 
transfer  between  atoms.  (High-frequencey  electronic  polarization  effects  can  be  neglected,  since 

experimentalhigh-frequency  dielectricconstants,  =6.8  and  e”  =8.4,  are  much  smaller  than  their 
static  counterparts,^  =86  and  £^=170.)  Our  results  for  the  dielectricconstants,  =  92  and  = 
196  agree  well  with  experimental  measurements.  Pressure  dependence  of  the  dielectricconstants 
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has  also  been  studied.  The  calculated  slopes,  dlne^/dP  =  -0.09  GPa'1  and  dlneu/dP  =  -0.15 
GPa'1,  compare  favorably  with  the  experimental  results,  dlne^/dP  =  -0.05  GPa'1  and  d\n£u/ dP 
=  -0.12  GPa'1.  Furthermore,  the  calculated  directionally-averaged  dielectric  constant  of  the  anatase 
phase,  46,  agrees  well  with  an  experimental  value,  48.  Through  detailed  analyses  of  the  calculated 
electric-dipole  oscillations,  we  have  elucidated  the  characteristic  features  of  the  Ti-0  bonding  that 
play  an  important  role  in  dielectric  properties. 

§1,7  Multilevel  Algorithms  for  Parallel  Molecular  Dynamics  Simulations 

Molecular  dynamics  is  a  powerful  tool  for  the  atomistic  understanding  of  long-range  stress- 
mediated  phenomena,  phonon  properties,  and  mechanical  failure  of  nanostructures.  For  realistic 
modeling  of  nanostructures,  however,  the  scope  of  simulations  must  be  extended  to  larger  system 
sizes,  longer  simulated  times,  and  more  complex  realism  than  what  has  been  feasible  until  recently. 
Below  we  summarize  the  new  multilevel  algorithms  we  have  developed. 

Space-time  Multiresolution  Molecular  Dynamics 

The  most  prohibitive  computational  problem  in  MD  simulations  is  associated  with  the 
Coulomb  potential.  Our  MD  algorithm  is  based  on  multiresolutions  in  both  space  and  time.  The 
long-range  Coulomb  interaction  is  computed  with  the  fast  multipole  method  (FMM)  which  reduces 
the  computation  from  0(N2)  to  O(N)  for  ALatom  systems.  Multiple  time-scale  (MTS)  approach 
uses  different  time  steps  to  compute  forces  for  different  interatomic  separations.  This 
multiresolution  molecular  dynamics  (MRMD)  algorithm  has  been  implemented  on  parallel 
computers  using  spatial  decomposition. 

Multilevel  Preconditioning  for  Variable-charge  Molecular  Dynamics 

Conventional  interatomic  potential  functions  used  in  MD  simulations  are  often  fitted  to  bulk 
solid  properties,  and  they  are  not  transferable  to  systems  containing  defects,  cracks,  surfaces,  and 
interfaces.  Transferability  of  interatomic  potentials  is  greatly  enhanced  by  incorporating  variable 
atomic  charges  that  dynamically  adapt  to  the  local  environment.  Atomic  charges  can  be  determined 
by  equalizing  electronegativity.  A  multilevel  preconditioned  conjugate-gradient  method  is 
developed  for  this  problem  by  splitting  the  Coulomb-interaction  matrix  into  short-  and  long-range 
components.  The  sparse  short-range  matrix  is  used  as  a  preconditioner  to  improve  the  spectral 
property  of  the  linear  system  and  thereby  accelerating  the  convergence. 

Wavelet-based  Adaptive  Curvilinear-coordinate  Load  Balancing 

Simulation  of  nanostructures  is  often  characterized  by  irregular  atomic  distribution.  One 
practical  problem  in  simulating  such  irregular  systems  on  parallel  computers  is  that  of  load 
imbalance.  To  solve  this  load-imbalance  problem,  we  have  developed  a  dynamic  load-balancing 

scheme.  In  this  scheme,  we  introduce  a  curvilinear  coordinate  system,  g  ,  which  is  related  to  the 

atomic  coordinate,  x  ,  through  the  mapping,  |  =  x  +  u(x).  Workloads  are  partitioned  with  a 
uniform  3-dimensional  mesh  in  the  curvilinear  coordinate  system,  and  the  load-imbalance  and 
communication  costs  are  minimized  as  a  functional  of  u  (x).  Simulated  annealing  is  used  to  solve 
the  optimization  problem.  We  apply  multiresolution  analysis  based  on  wavelets  to  the  displacement 
field  u  (x)  for  a  compact  representation  of  abrupt  changes  in  partition  boundaries. 

Fuzzy-clustering  Approach  to  Hierarchical,  Long-time  Dynamics 

Many  important  material  processes  are  characterized  by  time  scales  that  are  many  orders-of- 
magnitude  larger  than  atomic  time  scales  (1015  sec).  A  new  algorithm  is  developed  for  large-scale, 
long-time  MD  simulations  by  combining  a  hierarchy  of  subdynamics.  Global  dynamics  is 
represented  by  the  quaternion  formulation  of  rigid-body  cluster  dynamics.  Fast  oscillation  of  each 
atom  around  the  local  potential  minimum  is  integrated  analytically  by  normal-mode  analysis.  The 
residual  dynamics  is  integrated  by  a  symplectic,  implicit  integration  scheme.  Fuzzy  clustering 
facilitates  seamless  integration  of  these  subdynamics.  The  fuzzy-body/implicit-integration/normal- 
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mode  (FIN)  scheme  sped  up  a  simulation  of  nanocluster  sintering  by  a  factor  of  28  over  a 
conventional  explicit  integration  scheme,  without  loss  of  accuracy. 

Octree-Based  Data  Compression 

A  serious  technological  gap  exists  between  the  growth  in  processor  power  and  that  of 
input/output  (I/O)  speed.  A  100-million-atom  MD  simulation  produces  5  GBytes  of  data  per 
frame,  and  a  successful  simulation  project  must  overcome  the  I/O  bottleneck.  The  large  data  size  is 
also  an  obstacle  for  checkpointing  simulation  states  for  fault  tolerance. 

Recently  we  have  developed  a  data-compression  scheme  for  scalable  I/O.  We  use  an  octree 
indexing  and  sort  atoms  accordingly  on  the  resulting  spacefilling  curve.  By  storing  the  difference 
of  successive  atomic  coordinates,  the  I/O  requirement  reduces  from  0(Mog/V)  to  0(N).  This, 
together  with  a  variable-length  encoding  to  handle  outliers,  reduces  the  I/O  size  from  50  to  6 
Bytes/atom  with  a  user-controlled  error  bound. 

All  of  our  multiresolution  algorithms  are  highly  scalable,  and  they  have  been  ported  to  Cray 
T3E,  SGI  Origin  2000,  HP  Exemplar,  and  IBM  SP  computers.  On  a  1,024-node  Cray  T3E,  we 
have  achieved  parallel  efficiencies  of  0.97  and  0.96,  respectively,  for  a  1.02-billion-atom  MD  and  a 
20.6-million-atom  variable-charge  MD  simulations,  see  Fig.  13. 


Fig.  13:  Scalablity  tests  of  (a)  classical  and  (b)  variable-charge  MD  algorithms  on  a  Cray  T3E  computer.  Total 
execution  (circles)  and  communication  (squares)  times  are  plotted  for  648,000P  atom  silica  systems  for  classical  MD 
and  20,160P  atom  alumina  systems  for  variable-charge  MD  on  P  processors  (P  =  1,  ...»  1,024). 


§2  RESEARCH  IN  PROGRESS 

In  this  section  we  briefly  describe  the  simulation  problems  we  are  currently  involved  in. 

§2.1  Stress  Distribution  in  In As/GaAs  Mesas 

The  large  (~  7%)  lattice  mismatch  and  associated  strain  at  InAs/GaAs  (001)  interfaces  have  recently 
been  utilized  to  fabricate  a  number  of  nanostructures.  It  is  well  known  that  the  strain  relief  leads  to 
the  formation  of  three-dimensional  island  structures  above  a  critical  amount  (~  2  monolayers)  of 
InAs  deposition.  When  InAs  is  deposited  on  a  <100>  oriented  GaAs  square  mesa  of  size  ~  75  nm, 
however,  the  critical  thickness  for  island  formation  is  enhanced  to  ~  11  ML  [A.  Konkar,  A. 
Madhukar,  and  P.  Chen,  Appl.  Phys.  Lett.  72,  220  (1998)].  Remarkably,  this  growth  is  “self- 
limiting”.  At  the  early  stage  of  InAs  deposition,  In  atoms  migrate  from  the  side  walls  to  the  mesa 
top,  realizing  the  substrate  encoded  size-reducing  epitaxy  (SESRE).  Once  the  InAs  thickness  on 
the  mesa  top  reaches  ~  1 1  ML,  however.  In  atoms  migrate  away  from  the  mesa  top.  A  possible 
mechanism  of  the  migration-direction  reversal  is  the  built-up  strain  energy  in  the  InAs  film  and  the 
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associated  change  in  the  stress-gradient  direction.  More  recently,  parallel  chains  of  InAs  islands 

havebeenfabricatedon  top  of  [1 10]  oriented  stripe  mesas  of  sub-100-nm  widths  on  GaAs  (001) 
substrates. 

Our  ongoing  simulation  effort  focuses  on  understanding  atomic  level  stress  distributions 
relevantto  the  self-limitednature  of  lattice-mismatchecbverlayergrowth  of  InAs  on  GaAs.  This  is 
of  considerable  significance  to  realizing  fault-tolerant, high  density,  two-  and  three-dimensiona 
arrays  of  quantum  boxes  for  applications  in  anticipated  new  paradigms  for  information  technology 
in  the  21st  century. 

Large-scale  MD  simulations  are  performed  to  investigate  the  mechanical  stresses  on 
InAs/GaAs  nanomesas  with  {101}-type  sidewalls,  see  Fig.  14.  The  in-plane  lattice  constant  of 
InAs  layers  parallelto  the  InAs/GaAs(001)  interface  starts  to  exceed  the  InAs  bulk  value  at  12  ML 
and  the  hydrostatic  stresses  in  InAs  layers  are  tensile  above  12th  ML.  As  a  result,  it  is  not  favorabk 
to  have  InAs  overlayers  thicker  than  12  ML.  This  may  explain  the  experimental  findings  of  the 
growth  of  flat  InAs  overlayers  with  self-limitingthickness  of  ~  11  ML  on  GaAs  nanomesas. 


Fig.  14 :  Atomic-level  hydrostatic  pressure  distribution  in  an  InAs/GaAs  squaremesa  with  a  12  ML  InAs  overlayer. 
Negative  pressure  means  tensile  and  positive  pressure  means  compressive. 

§2.2  Nanoindentation  in  Si3N4  Surfaces 

Nanoindentation  testing  is  a  unique  local  probe  of  mechanical  properties  of  materials  at  surfaces 
and  multilayeredstructures.  This  techniqueis  especially  useful  in  testing  surfaces  and  thin  films  in 
microelectronicsindustry.  For  example, it  is  used  for  measuring  critical  stresses  for  dislocation 
generationin  semiconductor  devices.  The  importance  of  atomic-levelunderstanding  of  indentation 
processes  is  widely  recognized. 

Using  MD  simulations,we  are  investigatingnanoindentationin  Si3N4.  The  nanoindentatior 
simulationis  performed  on  the  (0001)  surface  of  a  60  nm  x  60  nm  x  30  nm  crystallinea-Si3N4  slab 
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consisting  of  10  million  atoms,  see  Fig.  15.  The  sample  is  indented  using  a  pyramid  indenter  with  a 
load  ~  10  pN  and  indentation  depth  ~  10  nm.  From  the  load-displacementcurve,  hardness  value  is 
estimated  to  be  of  50.3  GPa.  (We  have  also  calculatedthe  hardness  of  amorphous  Si3N4  to  be  31.5 
GPa  using  a  similar  geometry.)  Our  simulations  reveal  significant  plastic  deformation  and 
pressure-induced  amorphization  under  the  indenter.  The  simulations  also  exhibit  anisotropic 

fracture  toughness:  Indentation  cracks  are  observed  along  the  [1210]  direction,  which  coincides 
with  one  of  the  diagonal  directions  of  the  indenter, but  not  for  the  other  diagonal  direction, [1 1 00] . 
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Fig.  15:  Pressure  distribution  in  Si3N4  during  (a)  loading  and  (b)  unloading  of  a  nanoindenter. 


Fig.  16:  Top  view  of  the  atomic  distribution  in  a  Si  3N4  (0001)  surface  after  unloading  of  a  nanoindenter. 

§2.3  Scalable  Molecular -Dynamics /Quantm -Mechanical  Algorithms 

Having  established  scalable  MD  algorithms,  current  focus  of  research  is  how  to  enhance  the 
physical  realism  of  these  simulations.  A  more  accurate  but  compute-intensive  method  explicitly 
deals  with  electron  wave  functions  and  their  mutual  interactions  in  the  framework  of  the  density 
functional  theory  (DFT)  and  electron-ion  interaction  using  pseudopotentials.  (The  DFT,  for  the 
development  of  which  Walter  Kohn  received  a  1998  Nobel  chemistry  prize,  reduces  the 
exponentially  complex  quantum  A-body  problem  to  a  self-consistent  0{ A3)  eigenvalue  problem.) 
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Our  DFT  code  is  based  on  the  minimization  of  an  energy  functional  in  the  local  density 
approximation  and  uses  a  higher-order  finite-difference  method  with  pseudopotentials.  Space  is 
discretized  with  a  grid  of  regular  spacing,  and  the  kinetic-energy  operator  is  expanded  up  to  sixth 
order  for  higher-order  finite-difference  expansion.  To  further  speed  up  the  convergence,  we  have 
used  a  multigrid  acceleration  scheme.  Atomic  forces  are  calculated  according  to  the  Hellmann- 
Feynman  theorem. 

We  have  also  implemented  an  0(N)  DFT  scheme  on  parallel  computers.  This  scheme 
computes  the  sparse  density  matrix  with  which  physical  quantities  such  as  the  energy  and  forces  are 
calculated.  This  algorithm  uses  higher-order  finite  differencing,  nonlocal  pseudopotentials, 
representation  of  the  density  matrix  with  local  nonorthogonal  orbitals,  and  spatial  decomposition  for 
parallelization.  We  have  implemented  the  0(N)  approach  for  GaAs  systems  containing  up  to 
22,500  atoms  on  1,024  T3E  nodes,  see  Fig.  17. 


Fig.  17:  Performance  of  linear-scaling  MD  and  QM  simulation  algorithms  on  1,024  Cray  T3E  processors: 
classical  MD  (circles);  environment-dependent,  variable-charge  MD  approach  (squares);  QM  scheme  based  on  the 
tight-binding  method  (triangles);  and  self-consistent  QM  scheme  based  on  the  density-functional  theory  (diamonds). 
The  conventional  0(N3)  density-functional  algorithm  is  also  shown  (crosses). 

§2.3  Hybrid  Molecular-Dynamics/Quantm-Mechanical  Simulation  Approach 

We  have  also  developed  scalable  software  infrastructure  to  enable  multiscale  simulations  of 
nanoelectronic  devices  using  MD  and  QM  methods,  see  Fig.  18. 

In  the  hybrid  QM/MD  simulation  scheme,  a  quantum  mechanical  system  described  by  the 
density  functional  theory  on  real-space  multigrids  is  embedded  in  a  classical  system  of  atoms 
interacting  via  an  empirical  interatomic  potential.  Handshake  atoms  coupling  the  quantum  and  the 
classical  systems  are  treated  by  a  novel  scaled  position  method.  The  scheme  is  implemented  on 
parallel  computers  using  both  task  and  spatial  decompositions.  An  application  to  oxidation  of  Si 
(100)  surface  demonstrates  seamless  coupling  of  the  quantum  and  the  classical  systems. 
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Fig.  18:  (Left)  Hybrid  QM/MD  simulation  of  an  02  molecule  (green) on  a  Si(001)  surface.  The  O  and  magenta  Si 
atoms  are  treated  quantum  mechanically.  Handshake  atoms  (yellow)  are  placed  between  the  QM  and  nearest-neighboi 
MD  (blue)  Si  atoms.  (Right)  Upon  the  dissociation  of  the  02  molecule,  enormous  heat  is  transmitted  through  the 
QM/MD  boundary  into  the  MD  region. 

§3  INTERACTIONS 

Below  we  briefly  describe  the  interactions  we  have  had  with  researchers  at  government  laboratory 
industry,  and  university. 

Argonne  National  Laboratory:  We  are  collaborating  with  a  number  of  scientists  at  the  DOE’s 
Argonne  National  Laboratory  in  Illinois.  One  of  our  students,  Alok  Chatteijee,  has  conducted 
neutron  scattering  measurements  of  structures  and  phonons  in  nanocluster-assembled  SiC  with  Dr. 
C.  Loong  at  the  Intense  Pulsed  Neutron  Source  Division.  We  are  also  collaborating  with  Drs.  T. 
Disz,  W.  Gropp,  R.  Stevens,  and  M.  Papka  on  designing  immersive  and  interactive visualizatioi 
tools,  parallel  algorithms,and  message-passing. 

NASA:  We  are  collaborating  with  Dr.  S.  Saini  at  NASA  Ames  on  algorithm  design  and 
implementationof  large-scalematerialssimulationson  NASA’s  Information  Power  Grid. 

Intel:  We  have  ongoing  collaboration  on  atomistic  simulation  of  electronic  devices  with  Dr.  S. 
Shankar’s  simulationand  modeling  group  at  Intel. 

Motorola:  Recently  we  have  started  collaboration  on  sintering  of  multilayered  ceramic  films  with 
Dr.  D.  Wilcox’s  experimentalgroup  at  CeramicTechnology  Center  at  Motorola. 

University  of  Southern  California:  We  have  been  collaborating  with  Dr.  A.  Madhukar’s  group 
at  the  University  of  Southern  Calif omiaon  intelligentmanufactuing  of  electronicdevices. 

§4  TRAINING  OF  GRADUATE  STUDENTS 

Our  graduate  students  are  enrolled  in  a  unique,  multi disciplinaryprogram  that  allows  them  to  obtain 
a  Ph.D.  in  physics  and  an  MS  from  the  Department  of  Computer  Science.  The  aim  of  this  program 
is  to  provide  students  with  broad-based  training  in  high  performance  computing  and 
communications  (HPCC)  and  the  physical  sciences.  In  connection  with  this  program,  we  have 
introduced  a  number  of  HPCC  courses  in  the  Physics  and  Computer  Science  Departmeits.  The 
Department  of  Physics  has  two  graduate  courses  in  computationalphysics  that  are  cross-listed  with 
computer  science  courses.  The  first  course  deals  with  classical  and  quantum  simulations  on  parallel 
architectures.  The  second  course,  designed  for  advanced  graduate  students,  covers  special  topics 
such  as  multiscale  phenomena,  multi  grid  methods,  wavelets,  etc.  In  the  Department  of  Computer 
Science,  we  have  introduced  three  new  HPCC  courses.  Soon  two  more  courses  will  be  added  to  the 
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computer-sciencecurriculum:a)  Grid  Computing;  and  b)  Scientific  Visualization.  The  courses  we 
have  introduced  emphasize  parallel  computing  and  algorithm  design  for  large-scale  scientific 
applications.  Students  have  access  to  a  number  of  parallel  machines  to  gain  hands-on  experience 
and  to  perform  research  on  large-scalecomputationalprojects. 

Our  students  have  excellent  opportunities  to  broaden  their  research  experience  beyond  the 
traditional  university  based  environment.  They  are  involved  in  our  collaborative  efforts  with 
computational  and  experimental  scientists  at  government  laboratories,  industries,  and  other 
universities.  These  interactions  have  significantly  enhanced  the  research  capabilities  of  students. 
Through  these  contacts,  students  also  have  access  to  excellentparallel  computing  and  visualization 
facilitiesat  other  institutions. 

One  of  the  dual-degree  students,  Dr.  Alok  Chatterjee^upervised  by  the  Pis  of  this  program 
receiveda  prestigious  Enrico  Fermi  Fellowship  from  the  US  Department  of  Energy  in  1999.  For 
the  developmentof  the  dual-degree  program,  Rajiv  K.  Kalia  and  Aiichiro  Nakano  have  received  a 
Distinguished  Faculty  Award  and  a  Faculty  ExcellenceAward, respectively  from  LSU  in  1999. 

§5  COMPUTATIONAL  FACILITIES 

For  the  research  program,  our  Concurrent  Computing  Laboratory  for  Materials  Simulations 
{CCLMS)  at  Louisiana  State  University  (LSU)  has  two  parallel  computing  laboratories,  one  in  the 
Department  of  Physics  and  Astronomy  and  the  other  in  the  Department  of  Computer  Science. 
With  $3  million  in  infrastructure  enhancement  grants  from  the  State  of  Louisiana,  these  labs  have 
been  equipped  with  several  parallel  machines  including:  Digital  Alpha  cluster— 64  Alpha 
processors  connected  via  Gigaswitches;  PC  cluster— 100  PCs  (550  MHz  Pentium  III,  to  be 
upgraded  to  a  164-node  PC  cluster  with  733  MHz  processors)  linked  by  fast  ethemet  switches 
(Fig.  19);  systolic  architecture— a  64-cell  Intel  i Warp. 


Fig.  19  :  (Left)  A  100-node  PC  cluster  at  the  CCLMS  .  (Right)  ImmersaDesk  virtual  environment  at  the  CCLMS . 

We  have  also  established  a  virtual  environment (VE)  laboratory  that  features  an  interactivi 
and  immersivelmmersaDesk  for  visualization(Fig.  19).  The  VE  lab  also  has  a  multiprocessor  SGI 
Onyx2/InfmiteReality2^n  Octane/MXE,and  an  8-processor  Power  Center  graphics  servers  as  well 
as  a  number  of  SGI  graphics  workstations.  Via  high-speed  networks,  the  ImmersaDesk  is  fully 
integrated  with  the  existing  parallel  machines  at  the  CCLMS  and  with  massively  parallel  computers 
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at  national  computer  centers.  (A  recent  NSF  high  performance  connectivity  award  to  LSU  allows 
access  to  high-speed  backbone  networks.) 

Simulations  have  also  been  performed  on  Cray  T3E,  SGI  Origin  2000,  and  IBM  SP 
computers  at  DoD’s  Major  Shared  Resource  Centers  and  a  Hewlett  Packard/Convex  Exemplar  at 
the  Center  for  Advanced  Computing  Research  at  Caltech. 
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